Using an Automatically Generated Dictionary and a Classifier to Identify a Person's Profession in Tweets
Abstract
Algorithms for classifying pre-tagged person entities in tweets into one of 8 profession categories are presented. A classifier is described that uses a semi-supervised learning algorithm and takes into account the local context surrounding the entity in the tweet, hashtag information, and topic signature scores. A method is also described that uses data from the Web to dynamically create a reference file, called a person dictionary, containing person/profession relationships, together with an algorithm that uses the dictionary to assign a person to one of the 8 profession categories. Results show that classifications made with the automated person dictionary compare favorably to classifications made using a manually compiled dictionary. Results also show that classifications made using either the dictionary or the classifier alone are moderately successful and that a hybrid method using both offers significant improvement.
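As a rough illustration of the hybrid idea in the abstract, the sketch below consults a person dictionary first and falls back to a context-based classifier when the name is missing. The abstract does not say how the two components are combined, so the dictionary-first ordering is an assumption, as are all names here (PERSON_DICTIONARY, lookup_profession, context_classifier, hybrid_classify), the sample entries, and the category labels. The keyword rule merely stands in for the semi-supervised classifier, which is not reproduced.

```python
from typing import Callable, Dict, Optional

# Hypothetical person dictionary: lower-cased name -> profession category.
# Names and category labels are invented for illustration only.
PERSON_DICTIONARY: Dict[str, str] = {
    "alex example": "athlete",
    "sam sample": "politician",
}


def lookup_profession(name: str, dictionary: Dict[str, str]) -> Optional[str]:
    """Return the dictionary's category for a name, or None if it is absent."""
    return dictionary.get(name.strip().lower())


def context_classifier(tweet: str) -> str:
    """Toy stand-in for a trained classifier: keyword rules over the tweet text."""
    text = tweet.lower()
    if "#election" in text or "vote" in text:
        return "politician"
    if "#wimbledon" in text or "match" in text:
        return "athlete"
    return "other"


def hybrid_classify(
    name: str,
    tweet: str,
    dictionary: Dict[str, str],
    classifier: Callable[[str], str],
) -> str:
    """Assumed combination rule: use the dictionary if it knows the person,
    otherwise fall back to the classifier's prediction from the tweet."""
    category = lookup_profession(name, dictionary)
    if category is not None:
        return category
    return classifier(tweet)
```

A call such as hybrid_classify("Alex Example", "what a day", PERSON_DICTIONARY, context_classifier) is resolved by the dictionary, while an unknown name is resolved from the tweet text alone.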
Similar resources
Rice Classification and Quality Detection Based on Sparse Coding Technique
Classifying rice types and determining their quality is a major issue in the scientific and commercial fields associated with modern agriculture. In recent years, various image processing techniques have been used to identify different types of agricultural products. Various color- and texture-based features are also used to achieve the desired results in this area. In this ...
A High Capacity Email Steganography Scheme using Dictionary
The main objective of steganography is to conceal a secret message within a cover medium in such a way that only the intended receiver can discern the presence of the hidden message. The cover medium can be text, email, audio, image, or video, which can be transmitted through a public channel, such as the Internet. By extending the use of email among Internet users, the provision of email steg...
Bootstrapped Learning of Emotion Hashtags #hashtags4you
We present a bootstrapping algorithm to automatically learn hashtags that convey emotion. Using the bootstrapping framework, we learn lists of emotion hashtags from unlabeled tweets. Our approach starts with a small number of seed hashtags for each emotion, which we use to automatically label tweets as initial training data. We then train emotion classifiers and use them to identify and score c...
Fault Detection of Bearings Using a Rule-based Classifier Ensemble and Genetic Algorithm
This paper proposes a reduct construction method based on discernibility matrix simplification. The method works with genetic algorithm. To identify potential problems and prevent complete failure of bearings, a new method based on rule-based classifier ensemble is presented. Genetic algorithm is used for feature reduction. The generated rules of the reducts are used to build the candidate base...
Improving the Performance of a Sparse Representation-Based Classifier for Classifying Brain Signals
In this paper, the problem of classification of motor imagery EEG signals using a sparse representation-based classifier is considered. Designing a powerful dictionary matrix, i.e. extracting proper features, is an important issue in such a classifier. Due to its high performance, the Common Spatial Patterns (CSP) algorithm is widely used for this purpose in the BCI systems. The main disadvanta...
Publication date: 2013